
java -jar snpEff/snpEff.jar download GRCh38.99

To list all available SnpEff databases, run the following command:

java -jar snpEff/snpEff.jar databases
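
The database listing is long, so it can help to filter it for the genome of interest. The following is a minimal sketch, assuming the command is run from the directory that contains the “snpEff” subdirectory; the variable name GENOME is introduced here only for illustration.

```shell
# GENOME is a hypothetical variable used only in this sketch.
GENOME="GRCh38"

# Run the listing only if Java and the SnpEff JAR are where we expect them.
if [ -f snpEff/snpEff.jar ] && command -v java >/dev/null 2>&1; then
    # Case-insensitive search of the database listing for the genome name.
    java -jar snpEff/snpEff.jar databases | grep -i "$GENOME" \
        || echo "no database name matched $GENOME" >&2
else
    echo "snpEff/snpEff.jar or java not found; run this from the working directory" >&2
fi
```

The matching lines show the database names that can be passed to the “download” command.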

Here, “snpeff” is the working directory that we created to hold the SnpEff software, the VCF file, and the output. The snpEff executable file and the database are in the “snpEff” subdirectory, while our VCF file is copied to the top level of “snpeff”. After copying the VCF file “humanSNP.vcf” into the working directory, you can annotate it using the following command:

java -Xmx8g -jar snpEff/snpEff.jar GRCh38.99 humanSNP.vcf > mySNPanot.vcf

This command produces three files: an annotated VCF file (mySNPanot.vcf), a gene file (snpEff_genes.txt), and a summary file in HTML format (snpEff_summary.html). SnpEff adds functional annotations under the ANN key in the INFO field of the VCF output file. Figure 4.12 shows the VCF output file, which is modified to show ANN under the INFO field. The INFO field may include the effect of the variant (stop lost, stop gained, etc.), the impact of the effect on the gene (HIGH, MODERATE, LOW, or MODIFIER), or the functional class of the variant (nonsense, missense, frameshift, etc.).
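
To make the layout of the ANN entry concrete, the sketch below pulls the effect, impact, and gene name out of one annotated record. It assumes the standard ANN layout, in which sub-fields are separated by “|” and the second, third, and fourth sub-fields hold the effect, the impact, and the gene name. The record itself is made up for illustration, and the function name ann_summary is hypothetical.

```shell
# One hypothetical annotated record (tab-separated VCF columns); values are invented.
record=$(printf 'chr21\t1000\t.\tG\tA\t50\tPASS\tDP=20;ANN=A|missense_variant|MODERATE|GENE1|ENSG0|transcript|ENST0|protein_coding|2/5|c.10G>A|p.Val4Ile|10/900|10/600|4/199||\n')

# ann_summary (hypothetical name): read VCF records on stdin and print
# effect, impact, and gene name from the first annotation of each record.
ann_summary() {
    awk -F'\t' '
        /^#/ { next }                          # skip VCF header lines
        {
            n = split($8, info, ";")           # column 8 is INFO
            for (i = 1; i <= n; i++) {
                if (info[i] ~ /^ANN=/) {
                    sub(/^ANN=/, "", info[i])
                    split(info[i], anns, ",")  # several annotations may be comma-separated
                    split(anns[1], f, "|")     # sub-fields of the first annotation
                    print f[2], f[3], f[4]
                }
            }
        }'
}

printf '%s\n' "$record" | ann_summary
# prints: missense_variant MODERATE GENE1
```

On real output, the same function could be applied to the whole file, for example with ann_summary < mySNPanot.vcf.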

Moreover, we can view the summary in the HTML file to get a general idea of the types, regions, and effects of the variants. If you have Firefox installed, you can display the summary with the “firefox” command, or you can open the file in any web browser.

firefox snpEff_summary.html

Figure 4.13 shows the summary of the annotation using SnpEff and variant rate details. Remember that the VCF file contains the variants of human chromosome 21 only. Figure 4.14 shows the number of variant effects by impact and by functional class. Only 68 SNVs (0.009%) have high impact. The remaining variants are SNVs with moderate impact (0.149%), SNVs with low impact (0.149%), and modifiers (99.575%).
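
A similar tally by impact can be approximated directly from the annotated VCF file. The sketch below counts the impact of the first annotation of each record, so its numbers need not match the HTML summary, which may count every effect of every variant. The sample records and the names sample_vcf and impact_counts are hypothetical, and the ANN entries are abbreviated to four sub-fields.

```shell
# sample_vcf (hypothetical): emit a tiny annotated VCF with invented records.
sample_vcf() {
    printf '##fileformat=VCFv4.2\n'
    printf 'chr21\t100\t.\tG\tA\t.\t.\tANN=A|stop_gained|HIGH|GENE1\n'
    printf 'chr21\t200\t.\tC\tT\t.\t.\tANN=T|missense_variant|MODERATE|GENE1\n'
    printf 'chr21\t300\t.\tA\tG\t.\t.\tANN=G|intron_variant|MODIFIER|GENE2\n'
    printf 'chr21\t400\t.\tT\tC\t.\t.\tANN=C|intergenic_region|MODIFIER|GENE2-GENE3\n'
}

# impact_counts (hypothetical): read a VCF on stdin and print, per impact
# level, the number of records and the percentage of annotated records.
impact_counts() {
    awk -F'\t' '
        /^#/ { next }                               # skip VCF header lines
        {
            if (match($8, /ANN=[^;]*/)) {           # locate the ANN entry in INFO
                ann = substr($8, RSTART + 4, RLENGTH - 4)
                split(ann, f, "|")                  # f[3] is the impact of the first annotation
                count[f[3]]++
                total++
            }
        }
        END {
            for (k in count)
                printf "%s\t%d\t%.3f%%\n", k, count[k], 100 * count[k] / total
        }' | sort
}

sample_vcf | impact_counts
```

For the file produced earlier, the call would be impact_counts < mySNPanot.vcf.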

FIGURE 4.12  A VCF annotated with SnpEff.